Search CORE

18 research outputs found

Mira: A Framework for Static Performance Analysis

Author: Meng Kewen
Norris Boyana
Publication venue
Publication date: 22/05/2017

The performance model of an application can provide understanding about its runtime behavior on particular hardware. Such information can be analyzed by developers for performance tuning. However, model building and analyzing is frequently ignored during software development until performance problems arise because they require significant expertise and can involve many time-consuming application runs. In this paper, we propose a fast, accurate, flexible and user-friendly tool, Mira, for generating performance models by applying static program analysis, targeting scientific applications running on supercomputers. We parse both the source code and binary to estimate performance attributes with better accuracy than considering just source or just binary code. Because our analysis is static, the target program does not need to be executed on the target architecture, which enables users to perform analysis on available machines instead of conducting expensive experiments on potentially expensive resources. Moreover, statically generated models enable performance prediction on non-existent or unavailable architectures. In addition to flexibility, because model generation time is significantly reduced compared to dynamic analysis approaches, our method is suitable for rapid application performance analysis and improvement. We present several scientific application validation results to demonstrate the current capabilities of our approach on small benchmarks and a mini application.

arXiv.org e-Print Archive

Crossref

Phosphate glass fibers facilitate proliferation and osteogenesis through Runx2 transcription in murine osteoblastic cells

Author: Ahmed Ifty
Boccaccini Aldo R
Chen Qiang
Gao Yongguang
Li Hui
Li Meng
Lin Xiao
Liu Xianhu
Qian Airong
Qiu Wuxia
Xiao Yunyun
Zhang Kewen
Publication venue: Wiley
Publication date: 01/02/2020

Cell-material interactions and compatibility are important aspects of bioactive materials for bone tissue engineering. Phosphate glass fiber (PGF) is an attractive inorganic filler with fibrous structure and tunable composition, which has been widely investigated as a bioactive filler for bone repair applications. However, the interaction of osteoblasts with PGFs has not been widely investigated to elucidate the osteogenic mechanism of PGFs. In this study, different concentrations of short PGFs with interlaced oriented topography were co-cultured with MC3T3-E1 cells for different periods, and the synergistic effects of fiber topography and ionic product of PGFs on osteoblast responses including cell adhesion, spreading, proliferation and osteogenic differentiation were investigated. It was found that osteoblasts were more prone to adhere on PGFs through vinculin protein, leading to enhanced cell proliferation with polygonal cell shape and spreading cellular actin filaments. In addition, osteoblasts incubated on PGF meshes showed enhanced alkaline phosphatase (ALP) activity, extracellular matrix mineralization, and increased expression of osteogenesis-related marker genes, which could be attributed to the Wnt/β-catenin/Runx2 signaling pathway. This study elucidated the possible mechanism of PGF on triggering specific osteoblast behavior, which would be highly beneficial for designing PGF-based bone graft substitutes with excellent osteogenic functions

Crossref

Repository@Nottingham

Compiler-Assisted Program Modeling for Performance Tuning of Scientific Applications

Author: Meng Kewen
Publication venue: University of Oregon
Publication date: 23/11/2021

Application performance models are important for both software and hardware development. They can be used to understand and improve application performance, to determine what architectural features are important to a particular program component, or to guide the design of new architectures. Creating accurate performance models of most computations typically requires significant expertise, human effort, and computational resources. Moreover, even when performed by experts, it is necessarily limited in scope, accuracy, or both. This research considers a number of novel static program analysis techniques to create performance-related program representations of high-performance computations. These program representations can be used to model performance or to support efficient and accurate matching of computational kernels. We develop two different tools for static analysis-based program representation and demonstrate how they can be used for the optimization of scientific applications

University of Oregon Scholars' Bank

Hyperspectral Super-Resolution Via Joint Regularization of Low-Rank Tensor Decomposition

Author: Kewen Qu
Meng Cao
Wenxing Bao
Publication venue: MDPI AG
Publication date: 14/10/2021

The hyperspectral image super-resolution (HSI-SR) problem aims at reconstructing the high-resolution spatial–spectral information of the scene by fusing low-resolution hyperspectral images (LR-HSI) and the corresponding high-resolution multispectral image (HR-MSI). In order to effectively preserve the spatial and spectral structure of hyperspectral images, a new joint regularized low-rank tensor decomposition method (JRLTD) is proposed for HSI-SR. This model alleviates the problem that the traditional HSI-SR method, based on tensor decomposition, fails to adequately take into account the manifold structure of high-dimensional HR-HSI and is sensitive to outliers and noise. The model first operates on the hyperspectral data using the classical Tucker decomposition to transform the hyperspectral data into the form of a three-mode dictionary multiplied by the core tensor, after which the graph regularization and unidirectional total variational (TV) regularization are introduced to constrain the three-mode dictionary. In addition, we impose the l1-norm on the core tensor to characterize the sparsity. While effectively preserving the spatial and spectral structures in the fused hyperspectral images, the presence of anomalous noise values in the images is reduced. In this paper, the hyperspectral image super-resolution problem is transformed into a joint regularization optimization problem based on tensor decomposition and solved by a hybrid framework between the alternating direction method of multipliers (ADMM) and the proximal alternate optimization (PAO) algorithm. Experiments conducted on two benchmark datasets and one real dataset show that JRLTD outperforms state-of-the-art hyperspectral super-resolution algorithms.

Multidisciplinary Digital Publishing Institute

(Figure 11) Comparison between the experimental and calculated viscosity of mixed oil

Author: Gao Yuanping
Li Kewen
Meng Qingping
Publication venue: PANGAEA
Publication date: 15/12/2011

Publishing Network for Geoscientific and Environmental Data

Multispectral and Hyperspectral Image Fusion Based on Regularized Coupled Non-Negative Block-Term Tensor Decomposition

Author: Hao Guo
Kewen Qu
Meng Cao
Wenxing Bao
Xuan Ma
Publication venue: MDPI AG
Publication date: 01/10/2022

The problem of multispectral and hyperspectral image fusion (MHF) is to reconstruct images by fusing the spatial information of multispectral images and the spectral information of hyperspectral images. Because the hyperspectral canonical polyadic decomposition model and the Tucker model cannot introduce the physical interpretation of the latent factors into the framework, it is difficult to use the known properties and abundance of endmembers to generate high-quality fusion images. This paper proposes a new fusion algorithm to address this problem. First, a coupled non-negative block-term tensor model is used to estimate the ideal high spatial resolution hyperspectral images; its sparsity is characterized by adding the 1-norm, and total variation (TV) is introduced to describe piecewise smoothness. Secondly, the different operators in two directions are defined and introduced to characterize their piecewise smoothness. Finally, the proximal alternating optimization (PAO) algorithm and the alternating direction method of multipliers (ADMM) are used to iteratively solve the model. Experiments on two standard datasets and two local datasets show that this method outperforms state-of-the-art methods.

Directory of Open Access Journals

Multispectral and Hyperspectral Image Fusion Based on Regularized Coupled Non-Negative Block-Term Tensor Decomposition

Author: Hao Guo
Kewen Qu
Meng Cao
Wenxing Bao
Xuan Ma
Publication venue: MDPI AG
Publication date: 23/10/2022

Multidisciplinary Digital Publishing Institute

Using a Delphi method and the analytic hierarchy process to evaluate Chinese search engines

Author: Dirk Lewandowski
Fei Meng
Jia Tina Du
Kewen Wu
Qinghua Zhu
Xiaoling Sun
Publication venue: Emerald
Publication date

Crossref

Metal surface defect detection based on improved YOLOv5

Author: Chuande Zhou
Hailun Zuo
Kang Liu
Kewen Xia
Minghui Meng
Yonghu Tan
Zhenyu Lu
Zhongliang Lv
Publication venue: Nature Portfolio
Publication date: 01/11/2023

During the production of metal material, various complex defects may come into being on the surface, together with a large amount of background texture information, causing false or missing detection in the process of small defect detection. To resolve those problems, this paper introduces a new model which combines the advantages of CSPlayer module and Global Attention Enhancement Mechanism based on the YOLOv5s model. First of all, we replace C3 module with CSPlayer module to augment the neural network model, so as to improve its flexibility and adaptability. Then, we introduce the Global Attention Mechanism (GAM) and build the generalized additive model. Meanwhile, the attention weights of all dimensions are weighted and averaged as output to promote the detection speed and accuracy. The results of the experiment on the GC10-DET augmented dataset show that the improved algorithm model performs better than YOLOv5s in precision, mAP@0.5 and mAP@0.5:0.95 by 5.3%, 1.4% and 1.7% respectively, and it also has a higher inference speed.

Directory of Open Access Journals